We are motivated by the need for impromptu (or as-you-go) deployment of multihop wireless networks, by human agents or robots; the agent moves along a line, makes wireless link quality measurements at regular intervals, and makes on-line placement decisions using these measurements. As a first step, we have formulated such deployment along a line as a sequential decision problem. In our earlier work, we proposed two possible deployment approaches: (i) the pure as-you-go approach, in which the deployment agent can only move forward, and (ii) the explore-forward approach, in which the deployment agent explores a few successive steps and then selects the best relay placement location. The latter was shown to provide better performance, but at the expense of more measurements and deployment time, which makes explore-forward impractical for quick deployment by an energy-constrained agent such as a UAV. Further, the deployment algorithm should not require prior knowledge of the parameters of the wireless propagation model. In [1], we therefore developed learning algorithms for the explore-forward approach.

The current paper provides deploy-and-learn algorithms for the pure as-you-go approach. We formulate the sequential relay deployment problem as an average-cost Markov decision process (MDP) that trades off power consumption, link outage probabilities, and the number of deployed relay nodes. First, we establish structural results for the optimal policy. Next, by exploiting the special structure of the optimality equation and by using the theory of asynchronous stochastic approximation, we develop two learning algorithms that asymptotically converge to the set of optimal policies as deployment progresses. Numerical results show reasonably fast convergence; hence, the model-free algorithms can be useful for practical, fast deployment of emergency wireless networks.
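To make the trade-off concrete, one plausible shape for such an average-cost objective is sketched below; the notation (distance L, per-link power P_k, outage probability Q_k, weights ξ_o and ξ_r) is our own shorthand for illustration and is not taken from the abstract.

\[
\min_{\pi}\ \limsup_{L\to\infty}\ \frac{1}{L}\,\mathbb{E}_{\pi}\!\left[\sum_{k=1}^{N_L}\big(P_k + \xi_{o}\,Q_k + \xi_{r}\big)\right],
\]

where N_L is the number of relays placed over a deployment of length L, P_k and Q_k are the transmit power and outage probability of the k-th link, and the nonnegative multipliers ξ_o and ξ_r price outage and relay usage against power consumption.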
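The following is a minimal sketch (our own, not the authors' algorithm) of the flavor of an asynchronous stochastic-approximation learner for as-you-go placement: the state is the number of steps since the last placed node, the actions are "place" and "skip", only the visited state-action pair is updated with a diminishing step size, and a slower running estimate tracks the average cost. The link model, cost weights, forced-placement range B_MAX, and exploration scheme are all illustrative assumptions.

```python
import random
from collections import defaultdict

# Illustrative constants (assumptions, not from the paper).
XI_OUTAGE = 10.0   # weight on link outage probability
XI_RELAY = 1.0     # cost charged per deployed relay
B_MAX = 5          # maximum link range in steps; placement is forced here
ACTIONS = ("place", "skip")

Q = defaultdict(float)     # Q-value estimate per (state, action)
visits = defaultdict(int)  # visit counts -> per-pair (asynchronous) step sizes
avg_cost = 0.0             # running estimate of the optimal average cost

def measure_link(steps_back):
    """Stand-in for a field measurement of the candidate link."""
    outage = min(1.0, 0.05 * steps_back + 0.1 * random.random())
    power = 1.0 + 0.5 * steps_back
    return power, outage

def one_step_cost(action, steps_back):
    """Cost sample: power + outage penalty + relay charge if we place, else 0."""
    if action != "place":
        return 0.0
    power, outage = measure_link(steps_back)
    return power + XI_OUTAGE * outage + XI_RELAY

def update(state, action, cost, next_state):
    """Asynchronous SA update: only the visited (state, action) pair is refreshed."""
    global avg_cost
    visits[(state, action)] += 1
    step = 1.0 / visits[(state, action)]
    target = cost - avg_cost + min(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += step * (target - Q[(state, action)])
    avg_cost += 0.01 * (cost - avg_cost)  # slower timescale for the average-cost estimate

# Illustrative deployment walk with epsilon-greedy decisions as the agent moves.
state = 1
for _ in range(10000):
    if state >= B_MAX:
        action = "place"
    elif random.random() < 0.1:
        action = random.choice(ACTIONS)
    else:
        action = min(ACTIONS, key=lambda a: Q[(state, a)])
    cost = one_step_cost(action, state)
    next_state = 1 if action == "place" else state + 1
    update(state, action, cost, next_state)
    state = next_state
```

Because measurements arrive only at the locations the agent actually visits, the state-action pairs are updated at irregular, data-driven times; this is the asynchronous regime that the stochastic-approximation convergence theory referred to in the abstract is designed to handle.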